sample compression
Private Distribution Learning with Public Data: The View from Sample Compression
We study the problem of private distribution learning with access to public data. In this setup, which we refer to as *public-private learning*, the learner is given public and private samples drawn from an unknown distribution $p$ belonging to a class $\mathcal Q$, with the goal of outputting an estimate of $p$ while adhering to privacy constraints (here, pure differential privacy) only with respect to the private samples. We show that the public-private learnability of a class $\mathcal Q$ is connected to the existence of a sample compression scheme for $\mathcal Q$, as well as to an intermediate notion we refer to as *list learning*. Leveraging this connection, we (1) approximately recover previous results on Gaussians over $\mathbb R^d$, and (2) obtain new ones, including sample complexity upper bounds for arbitrary $k$-mixtures of Gaussians over $\mathbb R^d$, results for agnostic and distribution-shift-resistant learners, and closure properties of public-private learnability under taking mixtures and products of distributions. Finally, via the connection to list learning, we show that for Gaussians in $\mathbb R^d$, at least $d$ public samples are necessary for private learnability, which is close to the known upper bound of $d+1$ public samples.
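For reference, the privacy constraint invoked above is standard pure differential privacy; the following statement is the textbook definition, not anything specific to this paper:

```latex
% A randomized learner $A$ is $\varepsilon$-differentially private if, for
% every pair of private datasets $S, S'$ differing in a single sample and
% every measurable set of outputs $E$,
\Pr[A(S) \in E] \;\le\; e^{\varepsilon} \, \Pr[A(S') \in E].
% In the public-private setting, this requirement is imposed only with
% respect to the private samples; public samples are used unconstrained.
```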
Review summary: The paper studies the problem of finding a small subset S* of a set S of labelled points such that the 1-nearest-neighbour classifier built on S* is consistent with S. The motivation is to speed up 1-NN classification of new points. Finding a minimal such S* is known to be NP-hard, so the paper is concerned with approximations; apparently, all previous results on this problem concerned heuristics. The present paper gives an algorithm whose approximation ratio is shown to be optimal, in the sense that doing significantly better is NP-hard.
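One of the classical heuristics alluded to above is Hart's Condensed Nearest Neighbor rule. A minimal Python sketch is below; this is the standard baseline, not the reviewed paper's approximation algorithm, and carries no approximation guarantee:

```python
import numpy as np

def condensed_nearest_neighbor(X, y):
    """Hart's CNN heuristic: greedily grow a subset S* such that the
    1-NN classifier over S* labels every point of (X, y) correctly.
    A classical baseline with no optimality guarantee."""
    keep = [0]  # seed the condensed set with an arbitrary point
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            # label point i by its nearest neighbor in the current subset
            dists = np.linalg.norm(X[keep] - X[i], axis=1)
            pred = y[keep][np.argmin(dists)]
            if pred != y[i]:        # misclassified: absorb it into S*
                keep.append(i)
                changed = True
    return np.array(keep)
```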
Sample Compression for Continual Learning
Comeau, Jacob, Bazinet, Mathieu, Germain, Pascal, Subakan, Cem
Continual learning algorithms aim to learn from a sequence of tasks, making the training distribution non-stationary. The majority of existing continual learning approaches in the literature rely on heuristics and do not provide learning guarantees for the continual learning setup. In this paper, we present a new method called 'Continual Pick-to-Learn' (CoP2L), which efficiently retains the most representative samples for each task. The algorithm is adapted from the Pick-to-Learn algorithm, rooted in sample compression theory. This allows us to provide high-confidence upper bounds on the generalization loss of the learned predictors, numerically computable after every update of the learned model. We also empirically show on several standard continual learning benchmarks that our algorithm is able to outperform standard experience replay, significantly mitigating catastrophic forgetting.
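As background, the Pick-to-Learn selection loop that CoP2L adapts can be sketched roughly as follows. This is a paraphrase from the abstract, not the authors' code; `fit` and `loss` are hypothetical placeholder interfaces:

```python
import numpy as np

def pick_to_learn(X, y, fit, loss, tol=0.0):
    """A sketch in the spirit of Pick-to-Learn: repeatedly train on the
    current compression set, then absorb the worst-predicted remaining
    sample, until every sample is explained within tolerance.
    Assumes fit(X, y) -> model and loss(model, x, y) -> float."""
    picked = [0]                                 # seed with one sample
    while True:
        model = fit(X[picked], y[picked])
        losses = np.array([loss(model, X[i], y[i]) for i in range(len(X))])
        losses[picked] = -np.inf                 # already in the set
        worst = int(np.argmax(losses))
        if losses[worst] <= tol:                 # all samples explained
            return model, picked                 # picked = compression set
        picked.append(worst)
```

The returned `picked` indices form the compression set, which is what makes the sample compression generalization bounds mentioned above numerically computable.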
Near-optimal sample compression for nearest neighbors
Lee-Ad Gottlieb, Aryeh Kontorovich, Pinhas Nisnevitch
We present the first sample compression algorithm for nearest neighbors with nontrivial performance guarantees. We complement these guarantees by demonstrating almost matching hardness lower bounds, which show that our bound is nearly optimal. Our result yields new insight into margin-based nearest neighbor classification in metric spaces and allows us to significantly sharpen and simplify existing bounds. Some encouraging empirical results are also presented.
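The guarantees here are margin-based. A minimal sketch of the underlying idea — compress the labelled sample to a γ-net and classify new points by 1-NN over the net — is below, written for Euclidean distance for concreteness (the paper works in general metric spaces, and this is an illustration of the idea rather than the paper's exact algorithm):

```python
import numpy as np

def gamma_net_compress(X, y, gamma):
    """Greedy gamma-net: keep a point only if it lies at distance > gamma
    from every point kept so far. When the sample has margin on the order
    of gamma, 1-NN over the net agrees with 1-NN over the full sample."""
    net = []
    for i in range(len(X)):
        if all(np.linalg.norm(X[i] - X[j]) > gamma for j in net):
            net.append(i)
    return X[net], y[net]

def predict_1nn(X_net, y_net, x):
    # classify x by its nearest neighbor in the compressed set
    return y_net[np.argmin(np.linalg.norm(X_net - x, axis=1))]
```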
Sample Compression Hypernetworks: From Generalization Bounds to Meta-Learning
Leblanc, Benjamin, Bazinet, Mathieu, D'Amours, Nathaniel, Drouin, Alexandre, Germain, Pascal
Reconstruction functions are pivotal in sample compression theory, a framework for deriving tight generalization bounds. From a small sample of the training set (the compression set) and an optional stream of information (the message), they recover a predictor previously learned from the whole training set. While reconstruction functions are usually fixed, we propose to learn them. To facilitate the optimization and increase the expressiveness of the message, we derive a new sample compression generalization bound for real-valued messages. From this theoretical analysis, we then present a new hypernetwork architecture that outputs predictors with tight generalization guarantees when trained using an original meta-learning framework. The results of promising preliminary experiments are then reported.
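For contrast with the learned reconstruction functions proposed here, a minimal example of a *fixed* reconstruction function is rebuilding a max-margin separator from the compression set alone (here via scikit-learn; an illustration of the classical notion, not this paper's architecture):

```python
from sklearn.svm import SVC

def reconstruct(compression_X, compression_y):
    """A fixed reconstruction function: the compression set alone
    determines the predictor (a near-hard-margin linear SVM, since the
    support vectors suffice to recover the separator). Sample compression
    bounds then apply to the reconstructed predictor."""
    return SVC(kernel="linear", C=1e6).fit(compression_X, compression_y)
```

The hypernetwork proposed in the paper replaces such a fixed map with a learned one that additionally consumes a real-valued message.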
Nearest-Neighbor Sample Compression: Efficiency, Consistency, Infinite Dimensions
Aryeh Kontorovich, Sivan Sabato, Roi Weiss
We examine the Bayes-consistency of a recently proposed 1-nearest-neighbor-based multiclass learning algorithm. This algorithm is derived from sample compression bounds and enjoys the statistical advantages of tight, fully empirical generalization bounds, as well as the algorithmic advantages of a faster runtime and memory savings. We prove that this algorithm is strongly Bayes-consistent in metric spaces with finite doubling dimension -- the first consistency result for an efficient nearest-neighbor sample compression scheme. Rather surprisingly, we discover that this algorithm continues to be Bayes-consistent even in a certain infinite-dimensional setting, in which the basic measure-theoretic conditions on which classic consistency proofs hinge are violated. This is all the more surprising since it is known that k-NN is not Bayes-consistent in this setting. We pose several challenging open problems for future research.
Bounded Memory Active Learning through Enriched Queries
Hopkins, Max, Kane, Daniel, Lovett, Shachar, Moshkovitz, Michal
The explosive growth of easily accessible unlabeled data has led to growing interest in active learning, a paradigm in which data-hungry learning algorithms adaptively select informative examples in order to lower prohibitively expensive labeling costs. Unfortunately, in standard worst-case models of learning, the active setting often provides no improvement over non-adaptive algorithms. To combat this, a series of recent works have considered a model in which the learner may ask enriched queries beyond labels. While such models have seen success in drastically lowering label costs, they tend to come at the expense of requiring large amounts of memory. In this work, we study what families of classifiers can be learned in bounded memory. To this end, we introduce a novel streaming variant of enriched-query active learning, along with a natural combinatorial parameter called lossless sample compression that is sufficient for learning not only with bounded memory, but in a query-optimal and computationally efficient manner as well. Finally, we give three fundamental examples of classifier families with small, easy-to-compute lossless compression schemes when given access to basic enriched queries: axis-aligned rectangles, decision trees, and halfspaces in two dimensions.
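For the axis-aligned rectangle case, the classical compression scheme keeps only the axis-extreme positive examples, from which the smallest enclosing box is reconstructed exactly. A minimal sketch of that standard scheme (our illustration, not necessarily the paper's lossless scheme, and assuming at least one positive example):

```python
import numpy as np

def compress_rectangle(X, y):
    """Keep at most 2d positive points: for each axis, one attaining the
    minimum and one attaining the maximum among the positives. These
    points determine the smallest axis-aligned box containing all
    positive examples, so the target rectangle is recovered exactly."""
    pos = X[y == 1]
    keep = set()
    for axis in range(X.shape[1]):
        keep.add(int(np.argmin(pos[:, axis])))
        keep.add(int(np.argmax(pos[:, axis])))
    return pos[sorted(keep)]

def reconstruct_rectangle(points):
    # the reconstruction function: recover the box from the kept points
    lo, hi = points.min(axis=0), points.max(axis=0)
    return lambda x: bool(np.all(x >= lo) and np.all(x <= hi))
```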